Antoniak, M., & Mimno, D. (2018). Evaluating the Stability of Embedding-based Word Similarities. Transactions of the Association for Computational Linguistics, 6, 107–119. https://doi.org/10.1162/tacl_a_00008
Arseniev-Koehler, A. (2022). Theoretical Foundations and Limits of Word Embeddings: What Types of Meaning Can They Capture? Sociological Methods & Research, 004912412211401. https://doi.org/10.1177/00491241221140142
Bail, C. A. (n.d.). Word Embeddings. https://doi.org/10.1201/9781003093459
Benoit, K., Wang, H., & Watanabe, K. (n.d.). Replication: Word embedding (GloVe/word2vec). https://quanteda.io/articles/pkgdown/replication/text2vec.html
Blodgett, S. L., Barocas, S., Daumé III, H., & Wallach, H. (2020). Language (Technology) is Power: A Critical Survey of “Bias” in NLP. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 5454–5476. https://doi.org/10.18653/v1/2020.acl-main.485
Bojanowski, P., Grave, E., Joulin, A., & Mikolov, T. (2017). Enriching Word Vectors with Subword Information. arXiv. http://arxiv.org/abs/1607.04606
Bolukbasi, T., Chang, K.-W., Zou, J. Y., Saligrama, V., & Kalai, A. T. (2016). Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. In D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, & R. Garnett (Eds.), Advances in Neural Information Processing Systems (Vol. 29). Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2016/file/a486cd07e4ac3d270571622f4f316ec5-Paper.pdf
Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183–186. https://doi.org/10.1126/science.aal4230
Chan, C.-H. (2023). grafzahl: Fine-tuning Transformers for text data from within R. Computational Communication Research, 5(1), 76. https://doi.org/10.5117/CCR2023.1.003.CHAN
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv. http://arxiv.org/abs/1810.04805
Egami, N., Fong, C. J., Grimmer, J., Roberts, M. E., & Stewart, B. M. (2022). How to make causal inferences using texts. Science Advances, 8(42), eabg2652. https://doi.org/10.1126/sciadv.abg2652
Feder, A., Keith, K. A., Manzoor, E., Pryzant, R., Sridhar, D., Wood-Doughty, Z., Eisenstein, J., Grimmer, J., Reichart, R., Roberts, M. E., Stewart, B. M., Veitch, V., & Yang, D. (2022). Causal Inference in Natural Language Processing: Estimation, Prediction, Interpretation and Beyond. Transactions of the Association for Computational Linguistics, 10, 1138–1158. https://doi.org/10.1162/tacl_a_00511
Firth, J. R. (1975). Studies in Linguistic Analysis. Wiley-Blackwell.
Garg, N., Schiebinger, L., Jurafsky, D., & Zou, J. (2018). Word embeddings quantify 100 years of gender and ethnic stereotypes. Proceedings of the National Academy of Sciences, 115(16). https://doi.org/10.1073/pnas.1720347115
Grimmer, J., Roberts, M. E., & Stewart, B. M. (2022). Text as data: A new framework for machine learning and the social sciences. Princeton University Press.
Grimmer, J., & Stewart, B. M. (2013). Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts. Political Analysis, 21(3), 267–297. https://doi.org/10.1093/pan/mps028
Hamilton, W. L., Leskovec, J., & Jurafsky, D. (2016). Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1489–1501. https://doi.org/10.18653/v1/P16-1141
Hvitfeldt, E., & Silge, J. (2022). Supervised Machine Learning for Text Analysis in R. Accompanying online tutorial, section 5. https://doi.org/10.1201/9781003093459
Jurafsky, D., & Martin, J. H. (2023). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. https://web.stanford.edu/~jurafsky/slp3/ed3book_jan72023.pdf
Jurriaan, N., & Gils, W. van. (2020). NLP with R part 2: Training Word Embedding models and visualize results. https://medium.com/cmotions/nlp-with-r-part-2-training-word-embedding-models-and-visualize-results-ae444043e234
Khodak, M., Saunshi, N., Liang, Y., Ma, T., Stewart, B., & Arora, S. (2018). A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors. https://doi.org/10.48550/ARXIV.1805.05388
Kjell, O., Giorgi, S., & Schwartz, H. A. (2023). The text-package: An R-package for analyzing and visualizing human language using natural language processing and transformers. Psychological Methods. https://doi.org/10.1037/met0000542
Kozlowski, A. C., Taddy, M., & Evans, J. A. (2019). The Geometry of Culture: Analyzing the Meanings of Class through Word Embeddings. American Sociological Review, 84(5), 905–949. https://doi.org/10.1177/0003122419877135
Kroon, A. C., Trilling, D., Meer, T. G. L. A. van der, & Jonkman, J. G. F. (2019). Clouded reality: News representations of culturally close and distant ethnic outgroups. Communications, 0(0). https://doi.org/10.1515/commun-2019-2069
Le, Q., & Mikolov, T. (2014). Distributed Representations of Sentences and Documents. In E. P. Xing & T. Jebara (Eds.), Proceedings of the 31st International Conference on Machine Learning (Vol. 32, pp. 1188–1196). PMLR. https://proceedings.mlr.press/v32/le14.html
Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Distributed Representations of Words and Phrases and their Compositionality. arXiv. http://arxiv.org/abs/1310.4546
Müller, P., Chan, C.-H., Ludwig, K., Freudenthaler, R., & Wessler, H. (2023). Differential Racism in the News: Using Semi-Supervised Machine Learning to Distinguish Explicit and Implicit Stigmatization of Ethnic and Religious Groups in Journalistic Discourse. Political Communication, 40(4), 396–414. https://doi.org/10.1080/10584609.2023.2193146
Pennington, J., Socher, R., & Manning, C. (2014). GloVe: Global Vectors for Word Representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1532–1543. https://doi.org/10.3115/v1/D14-1162
Roberts, M. E., Stewart, B. M., & Tingley, D. (2016). Navigating the Local Modes of Big Data: The Case of Topic Models. In R. M. Alvarez (Ed.), Computational Social Science (pp. 51–97). Cambridge University Press. https://doi.org/10.1017/CBO9781316257340.004
Rodman, E. (2020). A Timely Intervention: Tracking the Changing Meanings of Political Concepts with Word Vectors. Political Analysis, 28(1), 87–111. https://doi.org/10.1017/pan.2019.23
Rodriguez, P. L., & Spirling, A. (2022). Word Embeddings: What Works, What Doesn’t, and How to Tell the Difference for Applied Research. The Journal of Politics, 84(1), 101–115. https://doi.org/10.1086/715162
Rodriguez, P. L., Spirling, A., & Stewart, B. M. (2023). Embedding Regression: Models for Context-Specific Description and Inference. American Political Science Review, 1–20. https://doi.org/10.1017/S0003055422001228
Rogers, A., Kovaleva, O., & Rumshisky, A. (2020). A Primer in BERTology: What We Know About How BERT Works. Transactions of the Association for Computational Linguistics, 8, 842–866. https://doi.org/10.1162/tacl_a_00349
Rudkowsky, E., Haselmayer, M., Wastian, M., Jenny, M., Emrich, Š., & Sedlmair, M. (2018). More than Bags of Words: Sentiment Analysis with Word Embeddings. Communication Methods and Measures, 12(2–3), 140–157. https://doi.org/10.1080/19312458.2018.1455817
Schnabel, T., Labutov, I., Mimno, D., & Joachims, T. (2015). Evaluation methods for unsupervised word embeddings. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 298–307. https://doi.org/10.18653/v1/D15-1036
Schweinberger, M. (2023). Semantic vector space models in R. The University of Queensland, Australia, School of Languages and Cultures.
Song, H., Tolochko, P., Eberl, J.-M., Eisele, O., Greussing, E., Heidenreich, T., Lind, F., Galyga, S., & Boomgaarden, H. G. (2020). In Validations We Trust? The Impact of Imperfect Human Annotations as a Gold Standard on the Quality of Validation of Automated Content Analysis. Political Communication, 37(4), 550–572. https://doi.org/10.1080/10584609.2020.1723752
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention Is All You Need. https://doi.org/10.48550/ARXIV.1706.03762
Wendlandt, L., Kummerfeld, J. K., & Mihalcea, R. (2018). Factors Influencing the Surprising Instability of Word Embeddings. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 2092–2102. https://doi.org/10.18653/v1/N18-1190
Wilkerson, J., & Casas, A. (2017). Large-Scale Computerized Text Analysis in Political Science: Opportunities and Challenges. Annual Review of Political Science, 20(1), 529–544. https://doi.org/10.1146/annurev-polisci-052615-025542
Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., Davison, J., Shleifer, S., Platen, P. von, Ma, C., Jernite, Y., Plu, J., Xu, C., Scao, T. L., Gugger, S., … Rush, A. M. (2019). HuggingFace’s Transformers: State-of-the-art Natural Language Processing. https://doi.org/10.48550/ARXIV.1910.03771